Introduction

Star Wars Oxygen

I love Star Wars. I love the storytelling and fantasy, but I especially love the music. John Williams is amazing. There was a podcast called Star Wars Oxygen that covered the music of Star Wars, and it was one of my favorite podcasts of all time. Jimmy Mac hosted while voice actor, musician, and composer David W. Collins broke down the scores for the films we know and love in a way that gave me a new appreciation for them. I say there was a podcast because it went dark following the release of Rogue One. After 38 wonderful volumes the podcast just wasn’t updated any more, and we the fans have not heard anything about why they stopped producing the show.

Species diversity

I also love statistics and ecology, which is the study of how organisms relate to each other and their environments. One exciting area of research deals with diversity. We can use statistics to figure out how many things live in a certain area and to compare how similar or different habitats are to one another. In order to conduct an analysis like this you need a “count matrix,” which has habitats in rows and species in columns. The cells are filled in with counts of how many individuals of each species are found in each habitat. An example of a count matrix could look like this:

Example of a count matrix where each row represents a habitat and each column represents a species. The cells are filled in with counts of the number of each species observed at each habitat.
              Danaus plexippus   Vanessa cardui   Adelpha bredowii
Donner Pass   5                  6                0
Sierraville   4                  2                2
Davis         0                  0                3

In this example, we can see that Donner Pass and Sierraville are similar to each other because they share two species. Davis and Sierraville are somewhat similar because they have one species in common. If we were going to group these sites based on similarity, Donner Pass and Sierraville would be more similar to each other than either is to Davis.
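To put numbers on that intuition, here is a minimal Python sketch that computes a pairwise dissimilarity for the toy count matrix. It assumes the Bray-Curtis measure, a common choice for count data in ecology; the post does not say which measure was actually used, and the names `counts`, `bray_curtis`, and `dissimilarity` are made up for this illustration.

```python
from itertools import combinations

# Toy count matrix from the example: one list of species counts per habitat,
# in the order Danaus plexippus, Vanessa cardui, Adelpha bredowii.
counts = {
    "Donner Pass": [5, 6, 0],
    "Sierraville": [4, 2, 2],
    "Davis": [0, 0, 3],
}


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity (assumed measure, not confirmed by the post).

    0 means identical counts; 1 means the two sites share no species.
    """
    return sum(abs(x - y) for x, y in zip(a, b)) / (sum(a) + sum(b))


# Dissimilarity for every pair of habitats.
dissimilarity = {
    (site_a, site_b): bray_curtis(counts[site_a], counts[site_b])
    for site_a, site_b in combinations(counts, 2)
}

# Print pairs from most similar to least similar.
for (site_a, site_b), d in sorted(dissimilarity.items(), key=lambda kv: kv[1]):
    print(f"{site_a} vs {site_b}: {d:.3f}")

# The pair with the smallest dissimilarity is the first one a clustering
# method would join.
closest_pair = min(dissimilarity, key=dissimilarity.get)
```

Under this measure Donner Pass and Sierraville come out as the closest pair, Sierraville and Davis sit in the middle, and Donner Pass and Davis share nothing at all, which matches the grouping described above.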

If we plot these relationships as a tree (after some statistical wankery) we see that Donner Pass and Sierraville appear close together with Davis far apart from them.

Cluster plot of the toy example referred to above.

Please note that I created this page using RMarkdown in RStudio. All of the code and data used to create this post are freely available through this project’s GitHub repository.

Star Wars musical ecology

During the Star Wars Oxygen podcast, David W. Collins began what he called his “theme tracker,” which was essentially a spreadsheet of the number of times a theme played per film.

David W. Collins made a count matrix.

We can use statistics on count matrices.

We can apply statistics to Star Wars!!!! Oh happy day!!!

The data

To reverse engineer the theme tracker I listened back through all of the Star Wars Oxygen episodes with pencil and paper ready. I made note of how often a theme was played during a particular film every time Mr. Collins mentioned it. In some instances, I had to get a bit of help and I read the breakdowns and threads from these sites:

This was especially helpful when going through Attack of the Clones, which had a lot of music edits.

I then attempted my own impression of David W. Collins and Star Wars Oxygen: I went through Rogue One three times and counted each instance of what I thought was a “theme.” I am almost certainly wrong because I am not a trained musician, and I might have treated themes as separate when they were actually part of the same leitmotif.

The data I ended up with, which are used here, had:

  • 8 rows - one for each film (“ecosystem”)
  • 48 columns - one for each theme (“species”)

These data could be wrong or incomplete and are in need of improvement. I am particularly concerned by the lack of “rare” themes in the data set. Rare things are important in ecology. There are a few ways you could contribute:

General Plots

All the themes

Let’s make a histogram where the total number of appearances each theme makes in the saga is plotted. Hover your cursor over each bar to see what it represents.

Plot of all theme appearances

Themes by Film

Let’s make a plot where each film is represented by a bar and that bar is filled according to the frequency of the themes in that movie. Hover your cursor over a bar to see the theme and number of times it appeared in that film. Try clicking on compare data on hover to see all the themes at once.

Themes by film

Analysis

Clustering

Now we’ll make a tree depicting the relationships between the films in the data set (the seven films of the Star Wars saga plus Rogue One), just as we did in the above example.

My prediction for the clustering analysis: the three original trilogy films will cluster together, separate from the prequel trilogy films, which will also cluster together. I think that The Force Awakens and Rogue One will be more similar, musically, to the original trilogy than to the prequels.

Adding the data from Rogue One allows us to see where that film lies in relation to the others. Michael Giacchino rooted the music for Rogue One firmly within Star Wars. He used parts from A New Hope to form the themes used in Rogue One. For example, Jyn Erso’s Suite was based on “the Message,” which plays in the background when Obi-Wan says “You must learn the ways of the Force….” It is also the only Star Wars film to share “Darth Vader’s” theme with A New Hope.

Clustering of the Star Wars films based on their musical theme counts.

This plot shows Rogue One as more musically related to the original trilogy, which makes a lot of sense to me. However, the themes shared between A New Hope & The Force Awakens and between The Empire Strikes Back & Return of the Jedi create similarities that are too strong for Rogue One to break.

Jost’s D

Here we count the number of different themes and consider how many different themes there are if we weight “rarity.”

Plot of the effective number of themes by Star Wars film

To read this plot we look at the y (vertical) axis to see the number of themes. Along the x (horizontal) axis we have the different weights (q) we place on “rarity.” A weight of 0 means that all themes count equally, so the value is the actual number of themes present in each film. As we move right along the x axis we decrease the effect that a rarely occurring theme has on the y axis value. All the way to the right we hardly consider the effect that rare themes have on the effective number of themes.
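For readers who want to see the arithmetic, the sketch below assumes the plot shows Hill numbers of order q, which is the usual meaning of “effective number” in the diversity literature; the post does not spell out its formula. It is written in Python rather than reproducing the post’s own code, the function name `effective_number` is hypothetical, and it uses the Sierraville row of the butterfly example (4, 2, 2) instead of the real theme counts.

```python
import math


def effective_number(counts, q):
    """Hill number of order q for one row of a count matrix (assumed formula)."""
    total = sum(counts)
    # Drop absent types so they never contribute (0 ** 0 would count as 1).
    proportions = [c / total for c in counts if c > 0]
    if q == 1:
        # The general formula is undefined at q = 1; its limit is the
        # exponential of the Shannon entropy.
        return math.exp(-sum(p * math.log(p) for p in proportions))
    return sum(p ** q for p in proportions) ** (1 / (1 - q))


# Sierraville row from the toy butterfly matrix.
sierraville = [4, 2, 2]

# Diversity profile: one effective number per rarity weight q.
profile = {q: effective_number(sierraville, q) for q in (0, 1, 2)}
for q, value in profile.items():
    print(f"q = {q}: {value:.2f} effective species")
```

At q = 0 the result is simply the three species present. As q grows, the two less common species count for less and the value drifts down toward the number of dominant species, which is the same left-to-right decline seen in the plot.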

Note that A New Hope has the fewest themes (when q = 0). This is likely a result of incomplete data in the spreadsheet, or it could reflect the fact that it is the first film. Rogue One actually has the highest number of themes, but among the films scored by John Williams, Revenge of the Sith has the most themes in our data set. One thing that appears evident from this analysis is that all films have ~6 themes frequently used throughout.

One last note of geekery. The colors from that plot were made with an R package called spaceMovie that uses colors from the Star Wars films.

NMDS

Lastly, I want to employ a method called NMDS (Non-Metric MultiDimensional Scaling), which plots the location of each “habitat” in ordination space. This doesn’t mean much except to say that similar things should be closer together than dissimilar things.

NMDS Ordination plot of the Star Wars films.

Think about which films you could draw an ellipse around without including any other films. We could draw a circle around the prequels that only contains the prequels. This suggests that the prequel films are closer to each other than they are to other films. It also appears that The Force Awakens is closest to the original trilogy. These findings are consistent with the clustering plot we saw earlier. What is really cool about this plot is we see just how far away Rogue One is from the rest of the films.

Conclusions

I have four big takeaways about the music of the Star Wars films from this exercise:

  1. The original trilogy films are most similar to each other.
  2. The prequel trilogy films are most similar to each other.
  3. The Force Awakens is more similar to the original trilogy than to the prequels.
  4. Rogue One is its own thing.

These results make a lot of sense to me. I interpret these results to mean that John Williams kept similar themes throughout each of the two trilogies, and that The Force Awakens is building off of the original trilogy, which makes chronological sense. Lastly, Michael Giacchino used themes found in A New Hope to ground Rogue One in the Star Wars musical universe.